ABSTRACT

Research on reading teacher effectiveness has taken
several different directions over the past 30 years: during the 1950s
and most of the 1960s, research focused primarily on teacher
qualities; in the early 1970s attention shifted to the effect of the
teaching process on student learning, while in the late 1970s and
early 1980s experiments defined more specifically the factors involved
in teacher effectiveness. Experimental design as well as means of data
collection and analysis also altered during this time. The isolated
classroom observation of the 1960s, for example, was replaced in the
1970s by more direct observation in classroom settings. Despite
distinct improvements, such as refinements in the determination of
dependent and independent variables in teaching effectiveness, the
generalizability of significant findings continues to be limited by
methodological and experimental design problems. (MM)



***********************************************************************
* Reproductions supplied by EDRS are the best that can be made        *
* from the original document.                                         *
***********************************************************************






Reading Teacher Effectiveness Research:
Generalizability of Significant Findings



William H. Rupley
Texas A&M University



Beth S. Wise 
McNeese State University 



U.S. DEPARTMENT OF EDUCATION
NATIONAL INSTITUTE OF EDUCATION
EDUCATIONAL RESOURCES INFORMATION
CENTER (ERIC)

This document has been reproduced as received from the person or
organization originating it.

Minor changes have been made to improve reproduction quality.

Points of view or opinions stated in this document do not
necessarily represent official NIE position or policy.



"PERMISSION TO REPRODUCE THIS
MATERIAL HAS BEEN GRANTED BY

William H. Rupley

TO THE EDUCATIONAL RESOURCES
INFORMATION CENTER (ERIC)."



Instructional Research Laboratory
Educational Curriculum & Instruction
Texas A&M University, College Station, TX
Technical Series # R83002 



September, 1983 






William H. Rupley
Associate Professor 
Texas A&M University 
Department of EdCI 
College Station, TX 77843 






Reading Teacher Effectiveness Research: 
Generalizability of Significant Findings 



William H. Rupley              Beth S. Wise

Texas A&M University           McNeese State University



Historically, research in effective teaching has taken several
different directions over the past 30 years. Within this span of
inquiry, three distinct time periods are identifiable: (1) the 1950s
and 1960s, (2) the early 1970s, and (3) the mid 1970s to the present
(Duffy, 1980). Collectively, many of the research findings from these
time periods can be linked with effective reading instruction because
student reading achievement often served as the dependent variable
(Centra and Potter, 1980). Although such research has expanded the
knowledge base in reading teacher effectiveness, there are several
issues related to variable specification and generalizability of
findings that need to be considered in both the extant literature
and future research (Shavelson and Russo, 1977).

Major threats to the generalizability of research findings in
reading teacher effectiveness can come from a variety of sources;
however, those that appear to be pervasive throughout the past 30
years are related to the variables under investigation, data
gathering procedures, and analyses.

Mid 1950s and 1960s

The primary focus of teacher effectiveness research during the
1950s and most of the 1960s was on the qualities of the teacher. A
quality that was extensively investigated was teachers' personality.
Getzel and Jackson (1963) reported that over 1000 studies of teacher
personality had been conducted in the late forties and early fifties.
They concluded that after more than fifty years of inquiry, little
was known about the relationship between teacher personality and
effective instruction.

In addition to using personality as an independent variable,
researchers also explored the relationship of other teacher
characteristics and instructional methods to students' achievement.
Sex, education, race, and years of experience are examples of
independent variables that researchers attempted to link to
effective teaching.

Researchers used a variety of procedures to gather data on
teacher characteristics, instructional methods, student
characteristics, and student achievement. Questionnaires, surveys,
rating scales, and observation instruments were administered to
teachers, supervisors, and students in a search for significant
relationships with student achievement. Classroom observations were
used to study predetermined teacher behaviors assumed to be
associated with student achievement. Usually these observations
focused on verbal interactions, using the Flanders Interaction
Analysis Categories (Flanders, 1960), and on classroom variables
dealing with emotional climate, verbal emphasis, and
student-initiated activity, employing instruments such as the
Observation Schedule and Record (Medley and Mitzel, 1958).

An investigation of elementary reading during the mid sixties
also used direct classroom observation to study teachers'
implementation of a single method of teaching beginning reading
(Chall and Feldman, 1966). Students' reading gains served as the
dependent variable. Data gathered from the direct observations and
questionnaires were the independent variables.

Findings from the majority of these research investigations
lacked external validity. Critics of this research attacked it as
being isolated and remote from the actual classroom (Cogan, 1963).
Wallen and Travers (1963) felt that for progress to be made,
theory should precede practice in teacher effectiveness research;
however, their call went unanswered well into the late 1960s.

Although in the late sixties some attention was given to gathering
data in naturalistic classroom settings, data were often limited to
behaviors that were part of the content of the instrument used, which
did not allow for the systematic recording of classroom events that
occurred outside of the specified content. Furthermore, minimal
attention, if any, was given to the reliability and validity of
observation systems (Rupley and Mangano, 1982) and other data
gathering procedures, such as questionnaires and rating scales.
Teachers' reports of classroom instruction and supervisors' ratings
of teachers' effectiveness were assumed to be accurate indicators of
what actually occurred in classroom reading instruction. Such
methodological flaws resulted in a major threat to the
generalizability of any significant findings and offered little
application of results to either preservice or inservice teacher
training programs.

Early 1970s

Several important developments occurred during the early seventies
that provided a more cohesive direction for the study of teacher
effectiveness. Major reviews of past research were conducted (Dunkin
and Biddle, 1974; McNeil and Popham, 1973; Rosenshine and Furst, 1973),
which changed the direction of research focus. One major change was
a focus on the process of teaching in relation to its effect on the
product, which was students' learning.

Another significant event that reshaped the focus of teacher 
effectiveness research was the funding of a number of major inves- 
tigations between 1972 and 1975 by The National Institute of Edu- 
cation. Among the funded investigations were those that focused on 
(1) effective education of disadvantaged children (Soar and Soar, 
1972), (2) stability of teacher effectiveness (Brophy and Everston, 
1974), and (3) specification of effective teaching behaviors 
(Berliner, 1975). 

Methodological features of teacher effectiveness research were
also reconceptualized. Students' adjusted mean achievement in the
basic skill areas of math and reading, determined through pretesting
and posttesting, was being used as the dependent variable.
Independent variables related to students' adjusted mean achievement
were teachers' instructional behaviors, students' behaviors, and
students' socio-economic standing. Data collection was becoming
more classroom oriented, and more direct observation in natural
settings was being used to record instructional activities,
students' behavior, and classroom environments.

Taken as a group, the studies conducted in the early 1970s
varied in the types of teachers and students included, the kinds of
variables addressed, and the methods used. There was, however,
replication of findings in some of the studies, even though many
of these were more poorly designed than others. Correlational
analysis was the primary means of data analysis, and the
correlations obtained were in the middle ranges (Duffy, 1980).

Generalizability of significant results, however, was considerably
limited due to major methodological flaws. Data gathering
procedures, although more classroom focused than in the preceding
time period, still gave little consideration to reliability
and validity issues. Observer agreement was the only reliability
issue addressed, and observations in some studies were limited to
only one or two episodes (Rupley and Mangano, 1982). Consistency
across studies did help lend validity to the findings; however,
Rosenshine (1977, 1978) cautioned against implementing these initial
findings in teacher training programs before validity had been
established. Finally, a serious threat to external validity was
the statistical analysis employed in the majority of inquiries.
With a range of 100 to 1000 measures in a single study, many
significant correlations were obtained by chance. Furthermore, some
investigators applied significantly more measures in their analyses
than were actually studied, which violated the assumptions
underlying the statistical tests they used (Centra and Potter, 1980).
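
The point about chance correlations can be made concrete with a little
arithmetic. The following is only an illustrative sketch: it assumes
every correlation is tested at the .05 level and that the tests are
independent, neither of which is stated for the studies reviewed, and
the function names are hypothetical.

```python
def expected_false_positives(n_tests: int, alpha: float = 0.05) -> float:
    """Expected count of 'significant' results when no true relationship exists."""
    return n_tests * alpha


def familywise_error_rate(n_tests: int, alpha: float = 0.05) -> float:
    """Chance of at least one false positive, assuming independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


# The 100 and 1000 figures come from the range of measures cited in the text.
for n_measures in (100, 1000):
    print(
        n_measures,
        expected_false_positives(n_measures),
        round(familywise_error_rate(n_measures), 4),
    )
```

Under these assumptions, roughly 5 of 100 and 50 of 1000 correlations
would be expected to reach significance by chance alone.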

Late 1970s to present 

Several major reviews of the process-product research conducted
in the late sixties and early seventies helped to further refine
and define the direction of teacher effectiveness research (Rosenshine,
1977, 1978; Medley, 1977; Brophy, 1979). A notable outcome was the
more precise identification of the factors under investigation.
Specification of independent variables such as teacher-directed
instruction, pupil engagement, and classroom management became
more common across investigations.

Another major thrust was the emergence of classroom based
experimental studies of teacher effectiveness. These experimental
efforts were designed to test the validity of results of the large
scale correlational studies conducted in the early seventies
(Anderson, Everston, and Brophy, 1979; Good and Grouws, 1977;
Stallings, Needels, and Stayrock, 1979). This new experimental focus
had two major stages (Gage and Giaconia, 1981). The first stage
consisted of training one group of teachers to employ process
variables associated with effective instruction and withholding
training from a comparable group of teachers. Teachers' use of
process variables was measured through direct observation in the
teachers' classrooms. The second stage was characterized by using
observation data as the independent variables and students' product
outcome, such as reading achievement, as the dependent variable.

The experimental focus of teacher effectiveness research holds
considerable promise for more accurate specification of instructional
processes that cause student learning. However, the findings from the
major studies using such a methodology still have limited
generalizability and are open to question about causal influences
(Anderson, 1979).

Factors under observation were not uniformly defined; thus,
what was student engagement in one study may have been coded as a
different behavior in other studies. Reliability was most often
addressed in terms of inter-observer agreement, which only provides
a coefficient for agreement between observers. For example, a major
violation occurring frequently was to train observers until they
reached an established criterion for major categories, which fails to
account for the range of variation of each behavior within that
category.
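
A small worked example shows how agreement on major categories can
mask disagreement on the behaviors within them. This is a sketch
only: the category and subcategory labels, the two observers' codes,
and the use of simple percent agreement are all assumptions made for
illustration, not data or procedures from the studies cited.

```python
from typing import List, Sequence, Tuple

# Each coded interval is (major category, subcategory). All labels are hypothetical.
Code = Tuple[str, str]


def percent_agreement(a: Sequence, b: Sequence) -> float:
    """Proportion of intervals on which two observers recorded the same code."""
    if len(a) != len(b) or not a:
        raise ValueError("observers must code the same non-empty set of intervals")
    return sum(x == y for x, y in zip(a, b)) / len(a)


observer_1: List[Code] = [
    ("instruction", "explaining"), ("instruction", "questioning"),
    ("instruction", "feedback"), ("management", "transition"),
    ("management", "discipline"), ("instruction", "explaining"),
    ("instruction", "questioning"), ("management", "transition"),
]
observer_2: List[Code] = [
    ("instruction", "questioning"), ("instruction", "questioning"),
    ("instruction", "explaining"), ("management", "transition"),
    ("management", "transition"), ("instruction", "explaining"),
    ("instruction", "feedback"), ("management", "transition"),
]

# Agreement judged only on the major category of each interval.
category_agreement = percent_agreement(
    [major for major, _ in observer_1],
    [major for major, _ in observer_2],
)
# Agreement judged on the full code, including the subcategory.
subcategory_agreement = percent_agreement(observer_1, observer_2)

print(category_agreement)     # 1.0
print(subcategory_agreement)  # 0.5
```

In this constructed case the observers would meet any criterion set
for major categories while agreeing on only half of the discrete
behaviors, which is the level at which data were often analyzed.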

Little attention was given to establishing generalizability
coefficients for each subcategory of an observation system.
Generalizability theory assumes that the sample is equivalent to a
set of possible combinations of the conditions for which
observations can be made. Observations made within a particular
facet are generalizable to other similar situations. The purpose of
establishing generalizability is to determine the degree of
variability for each facet. Unless researchers address this issue
in the development and use of their observation systems, significant
findings will continue to have limited generalizability.
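
As a rough illustration of what a generalizability coefficient for one
subcategory might involve, the sketch below estimates variance
components for a single facet (observation occasions) crossed with
teachers, and forms the usual relative coefficient: teacher variance
divided by teacher variance plus residual variance over the number of
occasions. The one-facet design, the function name, and the data are
assumptions for illustration; the studies reviewed did not report such
an analysis.

```python
from statistics import mean
from typing import List, Optional


def g_coefficient(scores: List[List[float]], n_occasions: Optional[int] = None) -> float:
    """Relative generalizability coefficient for a teachers x occasions design.

    scores[t][o] is the observed amount of one behavior for teacher t on
    occasion o. n_occasions is the number of occasions a later study would
    average over (defaults to the number actually observed).
    """
    n_t, n_o = len(scores), len(scores[0])
    if n_t < 2 or n_o < 2 or any(len(row) != n_o for row in scores):
        raise ValueError("need a complete table with at least 2 teachers and 2 occasions")
    n_prime = n_occasions or n_o

    grand = mean([x for row in scores for x in row])
    teacher_means = [mean(row) for row in scores]
    occasion_means = [mean(col) for col in zip(*scores)]

    # Sums of squares for a two-way layout with one observation per cell.
    ss_teacher = n_o * sum((m - grand) ** 2 for m in teacher_means)
    ss_occasion = n_t * sum((m - grand) ** 2 for m in occasion_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_residual = ss_total - ss_teacher - ss_occasion

    ms_teacher = ss_teacher / (n_t - 1)
    ms_residual = ss_residual / ((n_t - 1) * (n_o - 1))

    # Variance components; negative estimates are truncated at zero.
    var_teacher = max((ms_teacher - ms_residual) / n_o, 0.0)
    var_residual = max(ms_residual, 0.0)

    denominator = var_teacher + var_residual / n_prime
    return var_teacher / denominator if denominator > 0 else 0.0


# Hypothetical counts of one coded behavior: 4 teachers observed on 3 occasions.
observed = [[12, 14, 13], [20, 19, 22], [8, 9, 7], [15, 17, 16]]
print(round(g_coefficient(observed), 3))
print(round(g_coefficient(observed, n_occasions=1), 3))  # a single observation episode
```

Running the function separately for each subcategory, and with
`n_occasions=1`, gives a sense of how much dependability is lost when
observation is limited to one episode.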

Inappropriate or weak experimental designs and data analyses
continue to limit the external validity of recent teacher
effectiveness research findings. Groups in some inquiries are neither
comparable nor adequately described. Data were in some studies
analyzed by discrete behaviors for each major category, but
observers' agreement was determined by their reaching criterion for
the overall major category. Finally, the assumptions of statistical
tests also were violated. Fifty-five one-way analyses of variance
were conducted in one inquiry (Anderson, Everston, and Brophy, 1979)
and significant process variables at the .05 level were reported. In
this instance, significant p values would have had to be equal to or
less than .0009.
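
The .0009 figure is consistent with dividing the .05 level by the 55
tests, a Bonferroni-style adjustment. The text does not name the
procedure, so that reading is an assumption; the function name below
is likewise hypothetical.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance level that holds the overall level at alpha."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


threshold = bonferroni_threshold(0.05, 55)
print(f"{threshold:.4f}")  # prints 0.0009
```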

As noted in Figure 1, major changes have occurred in the factors
investigated and the data gathering procedures employed in reading
teacher effectiveness research. The generalizability of significant
findings continues to be limited by methodological and experimental
design problems. However, it has been suggested (Good, 1979) that
outcomes from experimental studies will never be predictable,
since several teacher behaviors could be used to create the same
effect, and identical teachers could have different impacts on
different students. Anderson (1979) has suggested that since it is
impossible to control for all the variables, the classroom researcher
must somehow reach a compromise between scientific theory and classroom
reality.

Admittedly, a compromise is a fact of classroom research; however,
a compromise of the research tools (design, data gathering procedures,
and analyses) should not be a major threat to the
generalizability of significant results. Careful attention by
researchers to these research tools becomes even more important
when the magnitude of teacher effects on student reading achievement
is considered. McDonald's research (1975) has attributed 35 percent
of the variance of students' end-of-year reading achievement to teacher
effects; therefore, the magnitude of effect of a single process variable
on student achievement is going to be extremely small.



Insert Figure 1 here

Figure 1: Major threats to the generalizability of research findings
in reading teacher effectiveness in relation to factors investigated
and data gathering procedures utilized during selected time periods.

Time Period: Late 1950s and 1960s

  Factors Investigated:
    D - Student Attitudes; Student Achievement
    I - Teacher Personality; [illegible] Characteristics; Teacher
        Instruction; Methods/Materials

  Data Gathering Procedures:
    Self-Reports and Questionnaires; Supervisors' Rating Scales;
    Students' Rating Scales; Teachers' Perceptions; Classroom
    Observation of Predetermined Teacher Behaviors (usually verbal
    interactions and personality factors)

  Major Threats to Generalizability:
    Methodological - lack of reliability and validity of data
    gathering procedures and inappropriate experimental designs.
    Variable specification - lack of attention to validity of
    independent variables.

Time Period: Early 1970s

  Factors Investigated:
    D - Student Attitudes; Achievement
    I - [illegible]; SES

  Data Gathering Procedures:
    Self-Reports and Questionnaires; Classroom Observation of
    Behavior

  Major Threats to Generalizability:
    Methodological - lack of reliability and validity of data
    gathering procedures and inappropriate or weak experimental
    [remainder illegible]
    Variable specification - lack of attention to validity

Time Period: Mid 1970s to present

  Factors Investigated:
    D - Student Attitudes; Achievement
    I - Teacher Instruction; Student Behavior; SES; School
        [illegible]

  Data Gathering Procedures:
    Classroom [illegible] of Predetermined Teacher and [illegible]
    Behavior in [illegible]; Classroom [illegible] That Is
    [illegible] Events and [illegible]

  Major Threats to Generalizability:
    Methodological - lack of careful attention to reliability and
    validity of data gathering procedures and weaknesses in
    experimental designs and analyses.